The coding, storage, and reconstruction of images is a major concern in the
application of computer technology to technical and scientific problems. One example
is the flood of geophysical and intelligence data originating from satellite platforms.
In such applications it is highly desirable to reduce the storage and transmission
requirements for image data. An image can be coded compactly when it is
possible to exploit self similar redundancy in the image. The development of such a
so-called “fractal” method for compressing image data has been the focus of our
research project.

Our  approach  to  image  compression  has  been  to  tessellate  the  image  with  a 
tiling  which  varies  with  the  local  image  complexity,  and  to  check  for  self  similarity 
amongst  the  tiles.  Self  similarities  are  coded  as  systems  of  affine  transformations 
which  can  be  stored  far  more  compactly  than  the  original  image.  This  method 
is  inherently  lossy,  since  the  self  similarities  are  never  exact.  Although  the  tiling 
technique  yields  good  results  in  many  cases,  we  have  also  begun  to  investigate 
contour schemes which may lead to irregular tilings with even better compression
ratios,  computation  time  and  signal-to-noise  ratios. 

We have tested our encoding scheme on a variety of test images, gaining
compression ratios greater than 40:1. At high compression ratios, the scheme yields
better signal to noise ratios than are reported for other techniques. Our scheme
is versatile in that it allows a trade off between compression, reconstructed image
fidelity and encoding time. Our methods are computationally intensive but are
feasible for non-real time applications on workstations or mainframe computers.
The algorithms can be accelerated considerably by dedicated hardware for real time
requirements.

Fractal  compression  is  a  promising  approach  to  image  compression.  Within  a 
very  short  development  time,  fractal  techniques  have  yielded  results  which  rival  the 
best  examples  of  data  compression  afforded  by  other  methods.  Although  fractal 
encoding  of  images  is  complex  and  may  require  specialized  hardware  for  real  time 
applications,  the  decoding  process  can  be  widely  utilized  because  it  is  simple,  fast, 
and suitable for software implementation. Thus, it can be run on workstations or
personal computers without special requirements.

We  recommend  further  research  and  development  along  the  following  lines. 
First,  the  tiling  scheme  is  sufficiently  mature  to  consider  hardware  implementation 
for  possible  real  time  applications.  Second,  we  expect  that  the  application  of  fractal 
techniques  with  other  image  processing  methods,  such  as  contour  detection,  will 
lead  to  even  better  results.  Finally,  research  into  the  mathematical  foundations  of 
the  subject  is  warranted.  We  believe  that  a  program  which  integrates  hardware 
engineering,  software  development  and  further  mathematical  research  will  yield  the 
best  results  in  the  long  run. 

Section 1.0. Introduction.


The objective of the Phase I research was “to improve [lossy] image compression
techniques which are based on fractal constructions.” This was an ambitious
undertaking, since at the time of the proposal no such techniques were widely known.
In fact, although claims about “fractal image compression” existed, there were no
benchmark results and no hard data on such schemes.

Thus,  the  Phase  I  research  set  out  to  prove  the  feasibility  of  such  a  scheme, 
and  the  conclusion  of  this  research  effort  is:  such  a  scheme  is  feasible.  We  have 
implemented  a  working,  completely  automatic  image  compression  algorithm.  The 
algorithm  yields  compression  results  that  compare  favorably  to  other  state  of  the 
art  techniques. 

The  work  of  the  last  six  months  at  NETROLOGIC  can  be  broken  down  into 
two  parts.  The  first  part  demonstrates  automatic  encoding  of  digital  image  data  by 
storing  an  approximation  of  the  image  as  a  collection  of  affine  transformations  of  the 
plane.  The  amount  of  memory  required  to  store  the  transformations  is  considerably 
smaller  than  the  amount  of  memory  required  to  store  the  original  image,  and  the 
transformations  can  be  converted  back  to  an  image  by  a  simple  and  fast  procedure. 

The second part of the work deals with alternative schemes to compress images,
some transform based and some not. Since the first part already demonstrated our
ability to encode images as transformations, the results from this work are beyond
the scope of the original Phase I proposal. Since these schemes are preliminary,
they  should  be  considered  proprietary.  Like  the  technique  in  the  first  part,  these 
techniques  are  based  on  completely  new  algorithms  developed  at  NETROLOGIC 
and  the  University  of  California,  San  Diego. 

One  important  note  remains  to  be  made:  Although  the  work  demonstrates 
the  feasibility  of  encoding  images  as  transformations,  the  theoretical  foundation 
for the subject requires more development. The major theorem in the field is
the  contractive  mapping  fixed  point  theorem,  which  is  unsatisfactory  because  the 
bounds  it  gives  on  certain  rates  of  convergence  do  not  come  close  to  the  actual 
convergence  rates.  This  suggests  that  we  do  not  understand  the  mechanism  that 
controls  the  degree  to  which  an  image  can  be  encoded.  Until  the  subject  has  a 
well  formulated  and  successful  theoretical  foundation,  the  results  will  continue  to 
be  largely  empirical  and  less  than  satisfactory. 

Section  2.0.  Background. 

The following is a brief history of the development of the subject. Hutchinson
[5] introduced the theory of iterated function systems (a term coined by Barnsley)
to model self similar sets (such as in figure 3). Demko, Hodges, and Naylor [6]
first suggested using iterated function systems to model complex objects in
computer graphics. Barnsley, Demko, Elton, Sloan and others generalized the concepts
and suggested the use of fractals to model “natural scenes”. In his thesis [7], A.
Jacquin developed an image encoding scheme based on iterated Markov operators
on measure spaces and used it to encode 6 bit/pixel monochrome images.

M. Barnsley (see [2] or [3]) is credited with stimulating interest in image
compression based on “fractal” techniques. While much effort has been spent on the
use of fractals to generate images, Barnsley popularized the converse notion of
encoding the content of images using fractals. Although many claims have been made in
this field recently, results based on rigorous test procedures are almost nonexistent.
In an effort to bring results and details to scientific scrutiny, work was begun at
the University of California, San Diego (UCSD) and the Naval Ocean Systems Center
(NOSC), in addition to NETROLOGIC. The result of the work at NETROLOGIC
and the current research is presented in section 4.0.

The  subject  is  still  in  its  infancy  -  the  most  important  observation  is  that  the 
theoretical  foundation  is  still  very  weak  and  that  the  underlying  mechanism  driving 
the  encoding  process  is  not  well  understood.  The  existing  theoretical  foundations 
can  be  found  in  the  next  section. 

Section  3.0.  Theory. 

In  this  section  we  motivate  the  use  of  affine  maps  to  encode  images.  The  main 
tool  of  the  subject  is  an  old  but  powerful  theorem,  the  contractive  mapping  fixed 
point  theorem.  First  we  need  a  definition. 

Definition. Let $\delta$ be a metric on a space $F$. A map $W : F \to F$ is said to be a
contraction if there exists a positive real number $s \le 1$ such that

$$\delta(W(x), W(y)) \le s\,\delta(x, y)$$

for any two points $x, y \in F$. If $s < 1$ then $W$ is said to be a strict contraction.

Theorem (Contractive Mapping Fixed Point). Let $F$ be a complete metric
space with metric $\delta$. If $W : F \to F$ is a strict contraction, then there exists a unique
point $g \in F$ such that $g = W(g)$. Moreover, for any $f \in F$, the fixed point is the
limit $g = \lim_{n \to \infty} W^{\circ n}(f)$.

In  the  next  section  we  apply  this  theorem  to  produce  a  simple  example  of  image 
compression. 


Section  3.1.  A  Simple  Example. 

The following example shows a simple application of the contractive mapping
fixed point theorem. In this example $F$ is the space of compact subsets of $R^2$ and $\delta$
is the Hausdorff metric; then $(F, \delta)$ is a complete metric space. (The exact definition
of the Hausdorff metric is not important; it is sufficient to think of it as measuring
the extent to which two sets in the plane overlap.) We can then define the three
transformations shown in figure 1:


For any set $S$, let

$$W(S) = \bigcup_{i=1}^{3} w_i(S). \qquad (1)$$


Figure  1.  Three  affine  transformations  in  the  plane. 

Consider the following process: begin with the square
$A_0 = \{(x, y) : 0 \le x \le 1, 0 \le y \le 1\}$, as shown on the left of figure 1.
Let $A_1 = W(A_0)$ and in general $A_n = W(A_{n-1})$. Figure 2 shows $A_1$, $A_2$, $A_3$
and $A_4$. The sets appear to be converging to a limit set shown in figure 3. In fact,
the maps $w_i$ are strictly contractive in the Euclidean metric, and it is not hard to
show that this implies that the map $W$ is contractive in the Hausdorff metric. As
$n \to \infty$, the sets $A_n$ converge (in the Hausdorff metric) to a limit set $A_\infty$,
shown in figure 3. Moreover, for any compact set $S \subset R^2$, $W^{\circ n}(S) \to A_\infty$
as $n \to \infty$.


Figure 2. $A_1 = W(A_0)$ and its images $A_2$, $A_3$, and $A_4$.

That all compact initial sets converge under iteration to $A_\infty$ is important - it
means that the set $A_\infty$ is defined by the $w_i$ only. Also, it is not difficult to see why
such a thing is true. The $w_i$ are contractive; they halve the diameter of any set to
which they are applied. Thus, the images of any initial $A_0$ will shrink to points in
the limit as the $w_i$ are repeatedly applied.
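
The convergence described above can be checked numerically. The three maps of figure 1 are not reproduced in this text, so the following sketch assumes three hypothetical maps of the form w_i(p) = p/2 + t_i, each of which halves diameters; the offsets t_i are illustrative only. A compact set is represented by a finite sample of points, and the Hausdorff distance between the iterates of two different starting sets is printed at each step.

```python
import numpy as np

# Hypothetical offsets t_i; the actual maps of figure 1 are not given in the text.
OFFSETS = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]])

def apply_W(points):
    """W(S) = union of w_i(S), with w_i(p) = p / 2 + t_i (each halves diameters)."""
    return np.vstack([points / 2.0 + t for t in OFFSETS])

def hausdorff(a, b):
    """Hausdorff distance between two finite point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def iterate(points, n):
    """Apply W to the point set n times."""
    for _ in range(n):
        points = apply_W(points)
    return points

# Two different starting sets: the corners of the unit square, and a single point.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
single = np.array([[0.3, 0.7]])

for n in range(5):
    print(n, hausdorff(iterate(square, n), iterate(single, n)))
```

Under these assumed maps the printed distance is at least halved at every step, which is the sense in which the limit set depends on the maps alone and not on the starting set.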


Each $w_i$ is determined by 6 real values, so that for this example we require
18  floating  point  numbers  to  define  the  image.  In  single  precision,  this  requires  72 
bytes.  The  memory  required  to  store  an  image  of  the  set  depends  on  the  resolution; 
figure  3  requires  256  x  256  x  1 bit  =  8192  bytes  of  memory.  The  resulting  compression 
ratio  in  our  particular  example  is  thus  114  : 1.  However,  in  this  particular  example, 
the  limit  set  is  a  fractal  and  can  thus  be  decoded  to  any  resolution.  The  resulting 
compression  ratio  can  be  honestly  said  to  be  infinite.  In  applications,  however,  it  is 
less  than  honest  to  decode  an  image  at  a  different  size  than  it  was  encoded  at  and  to 
then  claim  very  high  compression  rates.  Just  as  with  the  fractal  in  this  example,  the 
fractal  compression  scheme  we  describe  later  will  generate  detail  at  all  scale  levels, 
even  though  such  detail  is  not  present  in  the  original  image.  It  is  very  important 
to  include  both  the  original  image  size  and  the  signal  to  noise  ratio  (or  some  other 
measure  of  error)  when  giving  results  for  any  image  encoding  method  and  “fractal” 
image  encoding  methods  in  particular. 
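
The arithmetic behind the 114 : 1 figure, and the reason a ratio quoted at a larger decoding size is misleading, can be made explicit. The sketch below uses only the numbers given above; the function name `ratio_at` and the 1024 pixel example are illustrative assumptions.

```python
FLOAT_BYTES = 4                       # single precision
ifs_bytes = 3 * 6 * FLOAT_BYTES       # 3 maps, 6 real values each
bitmap_bytes = 256 * 256 * 1 // 8     # 256 x 256 pixels at 1 bit per pixel

print(ifs_bytes, bitmap_bytes, bitmap_bytes / ifs_bytes)

def ratio_at(resolution):
    """Nominal ratio if the same 72 bytes are decoded at resolution x resolution."""
    return (resolution * resolution // 8) / ifs_bytes

# Decoding at a larger size inflates the nominal ratio without storing any
# additional information about the original image.
print(ratio_at(256), ratio_at(1024))
```

The stored code is the same 72 bytes in both cases, so only the ratio at the original image size, reported together with an error measure, is meaningful.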

Section  3.2.  The  Space  of  Images. 

The space in which we work when compressing images can be defined in the
following way. We denote the closed interval $[0, 1]$ by $I$, and the n-fold Cartesian
product of $I$ with itself by $I^n$. Let $F$ be the space consisting of all graphs of real
Lebesgue measurable functions $z = f(x, y)$ with $(x, y, f(x, y)) \in I^3$. Thus $f$ is
bounded. We think of a point in $F$ as an abstract image of infinite resolution,
with $f(x, y)$ representing the grey level (with 0 being black and 1 being white)
at the point $(x, y)$ in the image. We can model images with finite resolutions by
partitioning $I^2$ with a rectilinear grid and either insisting that $f$ be constant on
the boxes of the grid, or by averaging $f$ over each box. The model with infinite
resolution allows us to handle the theory somewhat more clearly. Color images can
be encoded as graphs of functions $f : I^2 \to I^3$ with range points representing the
color model of choice, for example RGB values.

We metrize $F$ using the metric induced by the essential supremum:

$$\|f\|_\infty = \inf\{\alpha : \mu(f^{-1}((\alpha, 1])) = 0\}.$$

The metric we wish to use is

$$\delta(f, g) = \|\,|f - g|\,\|_\infty.$$

The space $F$ with the metric $\delta$ is complete, allowing us to use the contractive
mapping fixed point theorem, when we identify images which have distance 0. Other
spaces of images are possible, for example the space of positive Borel measures
supported on $I^2$, but this space is difficult to metrize, and the particular space and
metrization is of little practical consequence ultimately. This is especially true in
light of the inability of the theory to bound the convergence rates even weakly, as we
show later.

We now return to the application of the contractive mapping fixed point
theorem to the problem at hand. Let $W = \bigcup_{i=1}^{n} w_i : F \to F$ denote some contractive
map, which we assume is built up of a union of local maps $w_i : F \to F$ (as in
equation 1, for example).

Following Hutchinson’s notation [5], we denote the fixed point
$g = |W| = \lim_{n \to \infty} W^{\circ n}(f)$. Then

$$|W| = W(|W|) = \bigcup_{i=1}^{n} w_i(|W|). \qquad (2)$$

We say that $W$ encodes an image $f \in F$ if $f = |W|$. Given $W$, it is easy to
find the image that it encodes - begin with any image $f_0$ and successively compute
$W(f_0)$, $W(W(f_0))$, ... until the images converge to $|W|$ (just as in the example in
section 3.1). The converse is considerably more difficult: given an image $f$, how
do we find a mapping $W$ such that $|W| = f$? We know of no general, non-trivial
solution to this problem, nor do we expect that one exists. We attempt instead
to find an image $f' \in F$ such that $\delta(f, f')$ is minimal with $f' = |W|$. Equation
2 suggests how this might be possible. We seek domains $D_1, \ldots, D_n \subset F$ and
corresponding transformations $w_1, \ldots, w_n : F \to F$ such that

$$f \approx W(f) = \bigcup_{i=1}^{n} w_i|_{D_i}(f). \qquad (3)$$


This equation says: cover $f$ with parts of itself; the parts are defined by the $D_i$
and the way those parts cover $f$ is determined by the $w_i$. Equality in equation (3)
would imply that $f = |W|$. Since we cannot cover $f$ exactly with parts of itself, we
try to do the best we can and hope that $|W|$ and $f$ will not look too different, i.e.
that $\delta(|W|, f)$ is small. The following observation, which is due to Barnsley [1] and
which he calls the Collage Theorem, gives us hope that this can be done. It is a
corollary of the contractive mapping fixed point theorem.

Corollary. Let $W : F \to F$ be a contraction with contractivity $s < 1$ and let $f \in F$ be
an image. Then $\delta(|W|, f) \le \frac{1}{1-s}\,\delta(W(f), f)$.
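
The corollary follows from the triangle inequality and the fixed point property $|W| = W(|W|)$. The short derivation below is the standard argument, written out here for convenience rather than quoted from the references:

```latex
\begin{align*}
\delta(|W|, f) &\le \delta(|W|, W(f)) + \delta(W(f), f) \\
               &=   \delta(W(|W|), W(f)) + \delta(W(f), f) \\
               &\le s\,\delta(|W|, f) + \delta(W(f), f),
\end{align*}
so that $(1 - s)\,\delta(|W|, f) \le \delta(W(f), f)$, and dividing by $1 - s > 0$
gives $\delta(|W|, f) \le \frac{1}{1-s}\,\delta(W(f), f)$.
```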

Our problem is to find a $W$ such that $\delta(W(f), f)$ is minimized and such that
$s$ is small. We will then know that $|W|$ is close (in $\delta$) to $f$. However, the bound in
the corollary is not very good; it provides motivation only and not a useful bound
in practice. In fact, it is possible to generate examples in which the bound in the
corollary is arbitrarily large while $\delta(|W|, f)$ is bounded. We have found empirically
that restricting $s$ to be small results in poorer encodings, because while the
$\frac{1}{1-s}$ term decreases, $\delta(W(f), f)$ increases. Table 1 demonstrates this phenomenon.

Remark: We can always approximate any given image $f$ to within any $\epsilon > 0$. Since
the simple functions (functions whose range consists of a finite number of points)
are dense in $F$, we can find simple functions $g_1, \ldots, g_n$ such that $\delta(\sum g_i, f) < \epsilon$.
For details see [8]. We can then construct maps $w_1, \ldots, w_n$ and a map $W$ using the
simple functions $g_i$. This is not a deep point, because we sacrifice compression in
order to achieve accuracy - that is, we require a large number of maps $w_1, \ldots, w_n$
in order to have $\delta(f, W(f))$ small, and $n$ grows rapidly as $\epsilon \to 0$.

We found, while proving the convergence of $W$, that contractiveness was not
essential. It is sufficient for $W$ to be eventually contractive, meaning that there
exists some positive integer $m$ such that $W^{\circ m}$ is contractive. We generalized the
corollary to give a new bound on the efficacy $\delta(|W|, f)$ of the encoding $W$.
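
A small numerical illustration of an eventually contractive map may help. The linear map below is a hypothetical example chosen only to illustrate the definition; it is not taken from the encoding scheme. The map itself stretches some directions by a factor of about 2, yet its second iterate shrinks every distance by a factor of about 0.2, and iteration still converges to the unique fixed point.

```python
import numpy as np

# Hypothetical linear map on R^2, chosen only to illustrate the definition.
A = np.array([[0.0, 2.0],
              [0.1, 0.0]])

def lipschitz(M):
    """Smallest s with |Mx - My| <= s |x - y| in the Euclidean norm (spectral norm)."""
    return np.linalg.norm(M, 2)

s1 = lipschitz(A)        # about 2.0: W itself is expansive in one direction
s2 = lipschitz(A @ A)    # about 0.2: W composed with itself is contractive
print(s1, s2)

# Iteration still converges to the unique fixed point (here the origin).
x = np.array([1.0, 1.0])
for _ in range(20):
    x = A @ x
print(np.linalg.norm(x))
```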

Generalized Corollary. For $f \in F$, $W^{\circ m} : F \to F$ contractive with contractivity
$\sigma < 1$, and $s_{max}$ the expansiveness of $W$,

$$\delta(|W|, f) \le \frac{1}{1 - \sigma}\,\frac{1 - s_{max}^{m}}{1 - s_{max}}\,\delta(W(f), f).$$

In encoding images, we restricted ourselves to eventually contractive maps,
since they yield better results. Nevertheless, the theoretical bound on the efficacy
of the encoding $\delta(|W|, f)$ in both the eventually contractive and contractive cases
is poor, as demonstrated in table 1. The metric used in this table is the rms metric.

Table 1 gives results for a typical portrait image. The first two entries in
the table are encodings resulting from contractive maps. The third entry is an
encoding with an eventually contractive map. The table demonstrates two points.
First, the eventually contractive map gives the best encoding. Second, even though
restricting the contractivity of $W$ improves the bounds of the contractive mapping
theorem corollaries, it worsens the encoding. This demonstrates that the bounds in
the corollaries are poor; hence they provide motivation rather than actual bounds.


δ(W(f),f)   δ(|W|,f)   s_max         σ      m   Col2-Col1     Theorem
21.965      23.487     [illegible]   0.8    1   1.513         109.825
20.306      20.976     [illegible]   0.9    1   [illegible]   203.06
18.937      19.621     4.05          0.85   5   [illegible]   720874.8

Table 1


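In the implementation of the algorithm, we restrict ourselves to affine maps of
the form

$$w_i \begin{pmatrix} x \\ y \\ z \end{pmatrix} =
\begin{pmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & s_i \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} +
\begin{pmatrix} e_i \\ f_i \\ o_i \end{pmatrix} \qquad (4)$$

where the $a_i, b_i, c_i, d_i, e_i, f_i, s_i$, and $o_i$ are determined by minimizing $\delta(W(f), f)$.
Finding good values for these parameters is the crux of the problem, and we describe
a method to find such values in the next section.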

Section  4.0.  Results. 

This  section  contains  the  main  Phase  I  research  results.  It  is  organized  into  two 
parts;  the  first  demonstrates  the  feasibility  of  encoding  digital  images  as  a  collection 
of  “tile”  transformations.  This  scheme  is  called  tiled  transform  image  encoding.  The 
transformations  can  be  constructed  into  an  image  which  is  an  approximation  of  the 
original  image  (the  compression  scheme  is  lossy),  and  there  is  a  trade  off  between 
compression  fidelity,  the  extent  to  which  the  reconstructed  image  resembles  the 
original,  and  compression  ratio,  the  ratio  of  memory  required  to  store  the  original 
image  to  the  memory  required  to  store  the  compressed  image.  Reference  [1]  gives 
further  details. 

The  second  part  of  this  section  exhibits  results  obtained  from  alternative  image 
compression  schemes,  also  developed  during  the  Phase  I  research.  These  schemes  are 
not  as  mature  as  the  tiled  transform  encoding  scheme,  but  they  do  display  several 
attractive  features.  First,  they  provide  a  foundation  for  an  alternative  method  of 
encoding  images  as  transformations  (this  alternative  has  not  yet  been  studied). 
Second, the schemes are simple and fast to implement and execute. And finally, the
schemes  can  be  implemented  in  a  variety  of  ways,  utilizing  standard  compression 
techniques  or  new  “fractal”  based  techniques.  These  algorithms  can  be  loosely 
described  as  contour  based,  since  they  depend  on  extracting  contours  (such  as  level 
curves  or  edges)  from  an  image. 

Section  4.1.  Results  from  Phase  I. 

A detailed explanation of the results and the image compression scheme may
be found in [1]. This text informally explains the general idea of the scheme. The
compression scheme depends on the following premise: Given an image, for example
a  face,  it  is  reasonable  to  expect  that  small  portions  of  the  image,  for  example  the 
tip  of  the  nose,  resemble  larger  scale  features,  such  as  a  properly  scaled  and  rotated 
chin.  When  small  scale  features  can  be  represented  as  transformations  of  larger 
scale  features,  an  image  can  be  stored  as  the  large  scale  features  plus  the  set  of 
transformations  needed  to  define  the  small  scale  features.  Since  it  essentially  never 
happens  that  a  tip  of  a  nose  looks  exactly  like  a  skewed  chin,  the  resulting  image  is 
never  identical  to  the  original.  On  the  other  hand,  the  amount  of  memory  required 
to  store  an  image  in  this  different  way  is  often  much  less  than  the  memory  required 
to  store  the  original. 

An  image  is  digitally  stored  as  a  collection  of  values.  Each  value  represents 
a  grey  level,  for  example  0  may  be  black  and  255  may  be  white,  with  the  values 
interpolating  the  grey  level  between  these  extremes.  Successive  values  represent 
dots  or  pixels,  forming  a  matrix  which  can  be  viewed  on  a  monitor  and  which  looks 
like  an  image,  since  the  human  visual  system  tends  to  ignore  the  pixelization. 

The main difficulty of the scheme described above is finding the transformations.
Our tiling transformation image encoding scheme (TTIE) finds the transformations
in the following way. The image is broken up into tiles; for example, a
256 x 256 pixel image is partitioned into contiguous, non-overlapping 8x8 tiles,
called range tiles. The image is also partitioned into 16 x 16 pixel tiles called
domain tiles. For each range tile, we seek a domain tile which looks most like it. The
computer determines how much two tiles “look alike” by an affine least squares fit of
the pixel values, using 1 of every 4 pixels to accommodate the difference in domain
to range tile size. Domain tiles are checked in 8 possible orientations, corresponding
to the symmetries of the square. The transformations stored are then determined
by which domain mapped to which range, the orientation of the domain, and the
affine transformation (scaling and offset) on the grey levels of the domain.
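
The search just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for the results in this report: the function names, the tuple layout of the stored code, the exhaustive search, and the clamp `s_max` on the grey level scaling are all hypothetical choices, and the image dimensions are assumed to be multiples of the domain tile size.

```python
import numpy as np

def orientations(tile):
    """The 8 symmetries of the square: 4 rotations, each with and without a flip."""
    for k in range(4):
        rot = np.rot90(tile, k)
        yield 2 * k, rot
        yield 2 * k + 1, np.fliplr(rot)

def fit_grey(d, r, s_max=0.9):
    """Least squares s, o minimising |s*d + o - r|^2; s is clamped (an assumption)."""
    dm, rm = d.mean(), r.mean()
    var = ((d - dm) ** 2).sum()
    s = 0.0 if var == 0 else ((d - dm) * (r - rm)).sum() / var
    s = float(np.clip(s, -s_max, s_max))
    o = float(rm - s * dm)            # best offset for the (possibly clamped) s
    err = float(((s * d + o - r) ** 2).sum())
    return s, o, err

def encode(img, rsize=8):
    """Return one (range_y, range_x, dom_y, dom_x, orientation, s, o) per range tile."""
    img = img.astype(float)
    dsize = 2 * rsize
    h, w = img.shape
    # Subsample each domain tile once: 1 of every 4 pixels gives an rsize x rsize tile.
    domains = [(y, x, img[y:y + dsize:2, x:x + dsize:2])
               for y in range(0, h - dsize + 1, dsize)
               for x in range(0, w - dsize + 1, dsize)]
    code = []
    for ry in range(0, h, rsize):
        for rx in range(0, w, rsize):
            r = img[ry:ry + rsize, rx:rx + rsize]
            best = None
            for dy, dx, d in domains:
                for k, dk in orientations(d):
                    s, o, err = fit_grey(dk, r)
                    if best is None or err < best[0]:
                        best = (err, ry, rx, dy, dx, k, s, o)
            code.append(best[1:])
    return code

# Usage on a hypothetical 32 x 32 random image: one transformation per range tile.
demo = np.random.default_rng(0).uniform(0, 255, (32, 32))
print(len(encode(demo)))
```

Only the tuple per range tile needs to be stored; the compression comes from the fact that this is far less data than the 64 pixel values of the tile itself.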

Once all range tiles have been covered the image can be reconstructed in the
following way. An arbitrary initial image is chosen, for example an image consisting
entirely of zeros. The part of the initial image which corresponds to the domain
tile is copied to a separate storage location. This subimage is oriented properly, as
determined by the transformation. Its pixel levels are scaled and offset. It is shrunk
by taking only one of every 4 pixels, so that it shrinks to size 8x8. It is then
put in the position of the range tile determined by the current transformation. The
domain tile is unaffected. All of the transformations are applied. This whole cycle
is repeated several times until the resulting image remains stable. Convergence is
a consequence of the contractive mapping fixed point theorem, as long as the affine
transformations are taken to be contractive (or eventually contractive).
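
The reconstruction loop can be sketched as follows. The sketch is illustrative only: the tuple layout of the stored code, the orientation numbering, and the small hand-made code at the end are hypothetical, and each cycle here reads all domains from the previous image rather than updating in place. It assumes the range tiles cover the whole image.

```python
import numpy as np

def orient(tile, k):
    """Orientation k in 0..7: rotate by 90 degrees (k // 2) times, then flip if k is odd."""
    rot = np.rot90(tile, k // 2)
    return np.fliplr(rot) if k % 2 else rot

def decode_step(img, code, rsize=8):
    """Apply every transformation once, reading domains from the previous image."""
    out = np.empty_like(img)
    dsize = 2 * rsize
    for ry, rx, dy, dx, k, s, o in code:
        d = img[dy:dy + dsize:2, dx:dx + dsize:2]    # shrink: 1 of every 4 pixels
        out[ry:ry + rsize, rx:rx + rsize] = s * orient(d, k) + o
    return out

def decode(code, shape, start=None, max_iter=50, tol=1e-6):
    """Repeat the whole cycle until the image remains stable."""
    img = np.zeros(shape) if start is None else start.astype(float)
    for _ in range(max_iter):
        new = decode_step(img, code)
        if np.abs(new - img).max() < tol:
            return new
        img = new
    return img

# Hypothetical hand-made code for a 16 x 16 image: four 8 x 8 range tiles,
# all reading the single 16 x 16 domain at (0, 0), with grey scaling |s| < 1.
code = [(0, 0, 0, 0, 0, 0.5, 10.0), (0, 8, 0, 0, 1, 0.5, 40.0),
        (8, 0, 0, 0, 2, -0.5, 90.0), (8, 8, 0, 0, 5, 0.25, 20.0)]
a = decode(code, (16, 16))
b = decode(code, (16, 16), start=np.random.default_rng(0).uniform(0, 255, (16, 16)))
print(np.abs(a - b).max())
```

Starting from zeros and from random noise gives the same image to within the stopping tolerance, which is the independence from the initial image that the fixed point theorem guarantees.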

The  transformations  completely  determine  the  final  image,  independently  of 
the initial image chosen. There is no need to store the large scale detail separately;
it  is  created  out  of  the  small  scale  detail  stored  in  the  8x8  tiles.  The  small  scale 
detail  is  then  created  in  turn  by  the  mapping  of  the  large  scale  detail  into  the  8x8 
tiles. 

The algorithm is actually considerably more sophisticated than the one outlined
above. The size of the domain and range tiles is not fixed. Rather, it varies with
the  local  image  complexity  in  order  to  allow  fewer  transformations  to  be  used  in 
relatively  flat  portions  of  the  image  and  more  transformations  to  be  used  in  complex 
regions  of  the  image.  Also,  the  search  through  the  domain  tiles  is  not  exhaustive, 
since  this  is  computationally  intensive.  Rather,  the  domains  and  ranges  are  classified 
and  only  the  domains  of  the  same  class  type  as  a  given  range  are  searched  for  a 
good  least  squares  fit. 

Figure B.1 shows the decoding process. The initial image is, in this example, a
pattern of vertical lines, chosen to show the structure of the transformations after
one application of all of the maps. The figure contains, from left to right, top
then bottom: the initial image, one application of all of the transformations, two
applications of the transformations, and ten applications of the transformations, by
which point we have converged. The original image is shown in figure B.2. The last image
in figure B.1 can be encoded in 1/10th the memory required to store the image in
figure B.2, with an rms error of 8.59 (a signal to noise ratio of 29.5db).
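
The relation between the rms error and the signal to noise ratio quoted above can be checked directly. The report does not state its definition of signal to noise ratio; the sketch below assumes the peak signal to noise ratio for 8 bit images (peak value 255), an assumption which reproduces the quoted figure to within rounding.

```python
import math

def snr_db(rms_error, peak=255.0):
    """Peak signal to noise ratio in decibels for a given rms error.

    The peak value of 255 (8 bit grey levels) is an assumption.
    """
    return 20.0 * math.log10(peak / rms_error)

print(snr_db(8.59))
```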

Table  2  below  gives  several  typical  results  for  several  images.  The  table  specifies 
the  image  size  in  pixels;  the  compression  in  memory  required  to  store  the  original 
image  vs.  the  memory  required  to  store  the  compressed  image;  the  signal  to  noise 
ratio; the CPU seconds of computation time required to encode the image on a
Convex  C210;  and  the  number  of  transformations  in  the  encoding. 

In  general,  larger  images  yield  better  compression  ratios,  since  there  is  more 
interpixel  correlation.  The  compression  scales  roughly  with  the  image  size,  however, 
so  that  a  512  x  512  image,  which  contains  four  times  as  much  information  as  a 
256  x  256  image,  will  have  roughly  four  times  the  compression  ratio. 


Image    Size        Comp   SNR (db)   Time (sec)
Lena     512 x 512   15.6   32.1       899.9
Lena     512 x 512   38.7   29.2       -
Lena     512 x 512   23.5   30.0       2582.0
Lena     256 x 256   11.3   28.8       234.6
City     256 x 256   7.1    26.6       530.3
Collie   256 x 256   19.5   30.2       1898.9

Table 2


Figure  B.3  shows  the  original  256  x  256  pixel  collie  image  and  three  compressed 
versions  (left  to  right,  top  to  bottom)  at  respective  compressions  of  63.0:1  with  a 
signal  to  noise  ratio  of  25.2db,  28.2:1  with  a  signal  to  noise  ratio  of  29.3db,  and 
16.6:1  with  a  signal  to  noise  ratio  of  30.4db.  Figure  B.4  shows  an  encoding  of  a 
512 x 512 original Lena at compression 38.7:1 with a signal to noise ratio of 29.2db.

The images shown in appendix B have not been postprocessed. With
postprocessing, most of the boxy artifacts introduced by the compression scheme can be
eliminated. Some more details can be found in appendix A.

It  is  important  to  stress  that  the  software  yielding  these  results  has  not  been 
optimized.  It  is  of  a  highly  developmental  nature,  resulting  in  somewhat  less  than 
optimal results. Nevertheless, the results are comparable to other image
compression techniques (see section 5.0).

Section 4.2. Extra Results: Most Promising Lines for Future Research.

This  section  describes  results  which  are  not  directly  related  to  the  Phase  I 
proposal,  in  the  sense  that  they  do  not  demonstrate  the  feasibility  of  “fractal” 
image  compression.  Because  of  the  preliminary  nature  of  the  research,  this  section 
should  be  considered  proprietary. 

The  results  are  all  based  on  various  manipulations  of  contours.  The  schemes 
are: 

•  Transform  encoding  of  contours  (TEC); 

•  Image  encoding  through  level  curves; 

•  Image  encoding  through  leveling  of  edges. 

Section  4.2.1.  Transform  Encoding  of  Contours. 

We  briefly  discuss  the  one  dimensional  analogue  of  the  two  dimensional  tiled 
transform  image  encoding  scheme.  The  image,  in  this  case,  becomes  a  contour,  and 
the  tiles  become  portions  of  the  contour  called  chains.  The  algorithm  is  still  the 
same,  however:  The  contour  is  partitioned  into  long  domain  chains  and  short  range 
chains  (the  particular  lengths  are  not  important  and  can  be  varied  to  alter  the  final 
compression  and  the  fidelity  of  the  reconstructed  contour).  For  each  range  chain,  a 
domain chain is found which minimizes the root mean square distance between the
range chain and an affine transformation of the domain chain. These transformations
are recorded, along with the endpoints of the range chains. An approximation
of  the  original  contour  can  then  be  reconstructed  from  these  transformations  by 
iteratively  applying  all  of  the  transformations.  For  more  details,  see  [4]. 

Section  4.2.2.  Image  Encoding  Through  Level  Curves. 

The  idea  of  using  level  curves  to  encode  an  image  arose  in  conversation  with 
Josh  Deutsch,  of  the  physics  department  at  the  University  of  California,  Santa  Cruz. 
Spencer  Menlove,  of  NETROLOGIC,  also  contributed  to  both  the  implementation 
and  design  of  these  algorithms. 

A  level  curve  is  a  contour  in  the  image  which  has  a  constant  grey  level.  At  sharp 
contrast  points  of  the  image,  the  contour  is  forced  to  pass  through  pixels  which  may 
not  be  of  constant  grey  level  but  which  approximate  the  position  of  the  contour  had 
the  image  been  of  infinite  resolution  and  the  grey  levels  been  continuous.  Level 
curves  are  extracted  at  several  different  grey  values.  These  level  curves  can  be 
stored  compactly,  using,  for  example,  the  transform  encoding  of  contours  method 
described  in  the  previous  section.  To  reconstruct  the  image,  the  grey  levels  of  pixels 
between  contours  are  interpolated. 
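
As a rough sketch of the extraction step, assuming the image is a list of rows of grey values and using a simple threshold-crossing rule (our assumption, since the report does not state its rule): a pixel is placed on the level curve when it is at or above the chosen grey level and has a 4-neighbour below it. The function name `level_curve_pixels` is hypothetical.

```python
def level_curve_pixels(image, level):
    """Return the set of (row, col) pixels lying on the level curve for `level`."""
    height, width = len(image), len(image[0])
    on_curve = set()
    for r in range(height):
        for c in range(width):
            if image[r][c] < level:
                continue  # only pixels at or above the level can lie on the curve
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width and image[rr][cc] < level:
                    on_curve.add((r, c))
                    break
    return on_curve
```

Running this at several grey values gives a family of nested pixel sets, which would then be chained into contours before encoding.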


This  scheme  is  conceptually  simple,  simple  to  implement,  not  computationally 
intensive,  and  moderately  successful. 

Due to time constraints, this scheme and the TEC scheme were not united, so
the results reflect compression of the data using a lossless standard technique. From
the experimentation with the TEC scheme, we hope to achieve an improvement in
compression by roughly a factor of 2.

Section  4.2.3.  Image  Encoding  Through  Leveling  of  Edges. 

A  property  of  the  level  curve  encoding  scheme  is  that  it  preserves  edges.  This 
occurs  because  edges  are  at  high  contrast  points  of  the  images  through  which  the 
level  curves  tend  to  pass.  To  capitalize  on  this  observation,  a  similar  scheme  was 
developed  which  encodes  an  image  using  contours  which  run  along  edges  in  the 
image. 

In  this  scheme  the  edges,  or  high  contrast  points,  of  an  image  are  extracted 
and  connected  to  form  contours.  The  grey  level  at  each  edge  contour  is  averaged 
for  each  side  of  the  edge,  and  the  whole  contour  is  assigned  two  values:  one  for  each 
side  of  the  edge.  The  contours  can  be  compactly  encoded,  as  above,  and  the  image 
is  reconstructed  by  interpolating  the  grey  levels  between  the  edge  pixels. 
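
The reconstruction step can be illustrated on a single scanline. The sketch below rests on assumptions not stated in the report: linear interpolation between edges, an edge pixel set to the mean of its two side values, and constant grey level outside the outermost edges. The name `reconstruct_scanline` is hypothetical.

```python
def reconstruct_scanline(width, edges):
    """Rebuild one row from edges given as sorted (x, left_grey, right_grey) tuples."""
    row = [0.0] * width
    first_x, first_left, _ = edges[0]
    for x in range(first_x):
        row[x] = float(first_left)            # constant fill before the first edge
    for (x0, l0, r0), (x1, l1, _r1) in zip(edges, edges[1:]):
        row[x0] = (l0 + r0) / 2               # the edge pixel takes the mean of its sides
        span = x1 - x0
        for x in range(x0 + 1, x1):
            t = (x - x0) / span
            row[x] = (1 - t) * r0 + t * l1    # blend right side of x0 into left side of x1
    last_x, last_left, last_right = edges[-1]
    row[last_x] = (last_left + last_right) / 2
    for x in range(last_x + 1, width):
        row[x] = float(last_right)            # constant fill after the last edge
    return row
```

A two-dimensional reconstruction would interpolate along both image axes, or solve a smoothing problem with the edge values as boundary conditions.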

Due to time constraints, this scheme and the TEC scheme were not united, so
the results reflect compression of the data using a lossless standard technique. From
the experimentation with the TEC scheme, we hope to achieve an improvement in
compression by roughly a factor of 2.

Section  4.3.  Fractals  and  Wavelets:  Theoretical  Investigations. 

Transform  methods. 

One efficient way of determining the local frequency content of the image is
through transforms related to the Fourier transform. These, in order of increasing
generality, are the Fourier transform, the Wigner transform, and wave packet
transforms. The Wigner transform is given by the integral

$$W(f)(x, \xi) = \int e^{-2\pi i \xi p}\, f(x + \tfrac{1}{2}p)\, \overline{f(x - \tfrac{1}{2}p)}\, dp,$$

and  a  generalized  wave  packet  transform  is  given  by  the  integral 

P/(p,q,i)  =  J  e-^+^P Hll26[tll2(x  -  q)]/(x)tfp 

where $\phi(x)$ is a generalized Gaussian (see below). A little consideration of the
definition of $W(f)(x, \xi)$ will show that this transform gives the local frequency
content of the function $f$ in a neighborhood of $x$. The expression for $Pf(p, q, t)$ is
the inverse Fourier transform for $W(f)$ if we set $t = 1$ and $\phi = f$. Details may be
found in Folland's 1989 monograph, pp. 56-63 and 142-169.
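
As a worked illustration (ours, not taken from the report), assume the Wigner transform has the form $W(f)(x,\xi) = \int e^{-2\pi i \xi p} f(x + \tfrac{1}{2}p)\, \overline{f(x - \tfrac{1}{2}p)}\, dp$ and take the Gaussian $f(x) = e^{-\pi x^2}$. Using the standard fact that the Fourier transform of $e^{-\pi a p^2}$ is $a^{-1/2} e^{-\pi \xi^2 / a}$ with $a = 1/2$:

```latex
\begin{aligned}
W(f)(x,\xi) &= \int e^{-2\pi i \xi p}\, e^{-\pi (x + p/2)^2}\, e^{-\pi (x - p/2)^2}\, dp \\
            &= e^{-2\pi x^2} \int e^{-2\pi i \xi p}\, e^{-\pi p^2 / 2}\, dp \\
            &= \sqrt{2}\, e^{-2\pi (x^2 + \xi^2)} .
\end{aligned}
```

The result is concentrated near $x = 0$ and $\xi = 0$ simultaneously, which is the sense in which the transform reports local frequency content.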

Wavelet  transform  methods. 

We investigated relationships between fractal methods and wavelet transforms.
The objective of this phase of the study was to look for ways to improve the choice
of affine iterated function systems. One rationale for doing this is that our current
image compression scheme does not preserve edges well. Because edges contribute
heavily to human perception of image quality, it is important to amend this weakness.
An appropriate wavelet transform yields a powerful tool to extract edge
information from an image on a variety of scale lengths. Figure B.5 shows edges
extracted from a wavelet transform of “Lena” at different scale lengths. Edges
correspond to zero crossings of a wavelet transform of an image and the image may be
reconstructed, by standard methods, from such zero crossings.

Edge information may be used in conjunction with fractal methods in various
ways. One approach is to force the coding method to preserve edges by using an
error measure which weights the edge error more heavily. (Recall that our encoding
scheme chooses transforms on the basis of an integrated error.) An alternative
approach is to regard the edges as one-dimensional fractal curves. The edges themselves
may be coded as iterated transforms, using a 1-dimensional analog of the
2-dimensional method used for images, as has been detailed above.

Generalized  Gaussian  functions  give  one  class  of  wavelet  transforms.  In  order  to 
clarify  the  relationship  between  affine  fractals  and  this  particular  wavelet  transform, 
we  consider  the  behavior  of  the  generalized  Gaussian  under  affine  transformations. 
A  generalized  Gaussian  is  given  by  a  function: 

$$\phi_{A,b,d}(x) = \exp(x^t A x + b^t x + d),$$

where  x  is  a  vector  in  the  plane,  A  is  a  symmetric,  negative  definite  matrix,  and  b  is 
a  fixed  vector.  There  is  also  a  complex-valued  form,  in  which  the  entries  of  A  and  b 
are  complex  numbers  and  the  real  part  of  A  is  required  to  be  negative  definite.  The 
set  of  generalized  Gaussians  is  preserved  under  non-singular  affine  transformations 
and  the  orbits  (up  to  a  complex  factor)  are  specified  by  the  cogredience  class  of  A. 
According to a theorem of Sylvester, any two real symmetric matrices are cogredient
if  they  have  the  same  rank  and  the  same  signature  (Jacobson,  1953).  The  closure 
property may be seen by a simple calculation. If $x \mapsto Bx + c$ is an affine transform
$\mathcal{A}$, then $\phi_{A,b,d}(x)$ is transformed into

$$\phi_{A,b,d}(Bx + c) = \exp(x^t B^t A B x + c^t A B x + x^t B^t A c + b^t B x + c^t A c + b^t c + d).$$

The latter function is of the form $\phi_{A',b',d'}$, where $A' = B^t A B$, $b'^t = 2c^t A B + b^t B$ and
$d' = c^t A c + b^t c + d$. Because $A$ is symmetric and negative definite and $B$ is
non-singular, the equation

$$2c^t A B + b^t B = e^t$$

has a unique solution $c$. Therefore by proper choice of affine transforms, the
quadratic part $x^t A x$ is specified up to cogredience class. The linear part $b^t x$
is arbitrary, and we are left with a constant factor which cannot be specified.
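
The closure property can be checked numerically. The sketch below is our own illustration for the real 2x2 case: expanding $(Bx + c)^t A (Bx + c) + b^t (Bx + c) + d$ with $A$ symmetric gives $A' = B^t A B$, $b' = B^t(2Ac + b)$ and $d' = c^t A c + b^t c + d$. All function names are hypothetical.

```python
import math

def matvec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def gaussian(A, b, d, x):
    """Generalized Gaussian exp(x^t A x + b^t x + d) for a 2-vector x."""
    return math.exp(dot(x, matvec(A, x)) + dot(b, x) + d)

def transform_params(A, b, d, B, c):
    """Parameters (A', b', d') of x -> gaussian(A, b, d, Bx + c); A must be symmetric."""
    Bt = transpose(B)
    A_new = matmul(matmul(Bt, A), B)
    Ac = matvec(A, c)
    b_new = matvec(Bt, [2 * Ac[0] + b[0], 2 * Ac[1] + b[1]])
    d_new = dot(c, Ac) + dot(b, c) + d
    return A_new, b_new, d_new
```

Evaluating `gaussian(A, b, d, Bx + c)` and `gaussian(A', b', d', x)` at any point should agree to rounding error, which confirms that the transformed function is again a generalized Gaussian.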

Future work should clarify the relationship between image coding by wavelets
and image coding by iterated function systems. The invariance of the set of generalized
Gaussian functions under affine transformations indicates that the analog of a
fractal should be a sum of Gaussians which is preserved under an iterated function
system. In particular, if

$$f = \sum_i \phi_{A_i,b_i,d_i},$$

then we look for a system of affine transformations $\{\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_N\}$ with weights
$\{w_1, w_2, \ldots, w_N\}$ such that

$$f = \sum_{i=1}^{N} w_i\, \mathcal{A}_i^{-1}(f) = \sum_i \phi_{A'_i,b'_i,d'_i}.$$

Because we require

$$\sum_i \phi_{A_i,b_i,d_i} = \sum_i \phi_{A'_i,b'_i,d'_i},$$

the summations need not be unique.

Our survey of the literature has revealed some other connections between generalized
harmonic analysis and affine transformations, in particular through the
extended metaplectic representation and the inhomogeneous
symplectic group. The metaplectic representation is a representation of the symplectic
group (the group of $2n \times 2n$ matrices which preserve the symplectic form
$[(p, q), (p', q')] = pq' - p'q$ on vectors with $2n$ components). The symplectic group
is generated by matrices


The  action  is  given,  up  to  sign,  by 


$$f(x) \mapsto \frac{f(A^{-1}x)}{\sqrt{\det(A)}},$$

$$f(x) \mapsto \pm e^{-\pi i\, x^t C x} f(x),$$

$$f(x) \mapsto i^{-n/2} F^{-1} f(x),$$


where $F$ is the Fourier transform. The extended metaplectic representation is a
representation of the semidirect product of the Heisenberg and symplectic groups.
The semidirect product is given as pairs of operators $(X, A)$, where $X$ is in the
Heisenberg group and $A$ is in the symplectic group. The group product is given
by $(X, A)(X', A') = (X(AX'), AA')$, and the representation is given by $w(X, A) = r(X)m(A)$.
The Heisenberg group acts as follows for $X = (p, q, t)$:

$$Xf(x) = e^{2\pi i t}\, e^{2\pi i q x + \pi i p q} f(x + p).$$

Because  the  product  in  the  Heisenberg  group  is  given  by 

$$(p, q, t)(p', q', t') = (p + p',\; q + q',\; t + t' + \tfrac{1}{2}(pq' - qp')),$$

it  can  be  seen  that  the  extended  metaplectic  representation  contains  the  usual  action 
of  the  affine  group. 
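
A small numerical check of the one-dimensional case may help here. The sketch assumes the action $Xf(x) = e^{2\pi i t} e^{2\pi i q x + \pi i p q} f(x + p)$ and the product $(p, q, t)(p', q', t') = (p + p', q + q', t + t' + \tfrac{1}{2}(pq' - qp'))$. The names `heis_mul`, `rho` and `gauss` are ours.

```python
import cmath
import math

def heis_mul(X, Y):
    """Heisenberg group product for n = 1, with elements written as (p, q, t)."""
    p, q, t = X
    p2, q2, t2 = Y
    return (p + p2, q + q2, t + t2 + 0.5 * (p * q2 - q * p2))

def rho(X, f):
    """Return the function x -> e^{2 pi i t} e^{2 pi i q x + pi i p q} f(x + p)."""
    p, q, t = X
    def acted(x):
        phase = cmath.exp(2j * math.pi * (t + q * x) + 1j * math.pi * p * q)
        return phase * f(x + p)
    return acted

def gauss(x):
    """Test function e^{-pi x^2}."""
    return math.exp(-math.pi * x * x)
```

Applying `rho(X, rho(Y, f))` and `rho(heis_mul(X, Y), f)` at the same point should give the same complex number, so the action respects the group product. Elements of the form (p, 0, 0) act as pure translations f(x + p), which is the translation part of the affine group.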

The  inhomogeneous  symplectic  group  is  another  extension  of  the  symplectic 
group,  which  has  a  more  obvious  relationship  to  the  affine  group.  In  this  case  the 
semidirect  product  is  the  product  of  R2n  with  the  symplectic  group,  and  the  group 
multiplication  is 

$$(w, A)(w', A') = (w + Aw', AA').$$

The  representation,  as  before,  is  given  by  w(X ,  A)  =  r(X)m(A).  This  is  a  projective 
representation,  and  the  extension  of  this  representation  by  a  representation  of  the 
circle  group  is  the  extended  metaplectic  representation. 

Other connections between generalized Gaussians and affine transformations.

It is easy to show that the set of generalized Gaussians is preserved under the
convolution product. This may be derived from results on the oscillator semigroup
(Folland, page 231) or calculated directly. Furthermore, affine transformations
preserve the convolution product, up to a constant factor. If we write

$$\gamma[A, x_0, c] = \exp[(x - x_0)^t A (x - x_0) + c],$$

where, as before, the real part of $A$ is negative definite, then

$$\mathcal{M}\gamma[A, x_0, c] = \gamma[M^t A M,\; M^{-1}(x_0 - b),\; c],$$

where $\mathcal{M}(x) = Mx + b$, and

$$\gamma[A, x_0, c] * \gamma[B, y_0, c] = \gamma[-A^t (A + B)^{-1} A + A,\; x_0 + y_0,\; c'],$$

where $c' = c - \tfrac{1}{2}\log(\det(A + B))$ and $*$ represents the convolution product.

This may be useful in shortening the search for affine transformations which
map one part of the image into another. In particular, we may be able to use a
wavelet decomposition and the expression for the convolution to compute the best
correlation between regions of the image.
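
The convolution rule can be checked in one dimension, where the matrices reduce to negative scalars. The sketch below is ours: it compares a brute-force Riemann sum against the closed form with exponent coefficient $A - A^2/(A + B) = AB/(A + B)$ and centre $x_0 + y_0$. The prefactor $\sqrt{\pi / (-(A + B))}$ used here is the standard one-dimensional constant for $c = 0$ and is our assumption about how the constant factor appears. All names are hypothetical.

```python
import math

def gamma(A, x0, c):
    """1-D generalized Gaussian x -> exp(A (x - x0)^2 + c), with A < 0."""
    return lambda x: math.exp(A * (x - x0) ** 2 + c)

def convolve_at(f, g, x, half_width=20.0, step=0.01):
    """Approximate (f * g)(x), the integral of f(y) g(x - y) dy, by a Riemann sum."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        y = -half_width + k * step
        total += f(y) * g(x - y)
    return step * total

def convolved_params(A, x0, B, y0):
    """Closed form in 1-D: exponent coefficient, centre, and multiplicative prefactor."""
    coeff = A - A * A / (A + B)          # algebraically equal to A*B/(A + B)
    prefactor = math.sqrt(math.pi / -(A + B))
    return coeff, x0 + y0, prefactor
```

Because the convolution of two Gaussians is known in closed form, correlating Gaussian components of two image regions would not require any numerical integration of this kind.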


Affine  transformations  and  theta  functions. 


One definition for the theta function is given via a generalization of the Fourier
transform. We let ${}_1x$ be a vector in $R^k$ and ${}_jx$ stand for a vector in the space ${}_jR^k$
of homogeneous polynomials of degree $j$ in $R^k$. There is an obvious inner product
in this space, so we can define the nonlinear Fourier transform

$$Ff(r^1, \ldots, r^J) = \int f({}_1x)\, \exp[i r^1 \cdot {}_1x, \ldots, i r^J \cdot {}_Jx]\, dx.$$


If  we  take  /  to  be  where  in  is  the  lattice  of  vectors  with  integer  coor¬ 

dinates  in  Rk  and  J  =  2,  then  we  get  the  function  ^(x,  z).  Furthermore, 

$$\int f(M\,{}_1x)\, \exp[i r^1 \cdot {}_1x, \ldots, i r^J \cdot {}_Jx]\, dx = \frac{1}{\det(M)} \int f({}_1x)\, \exp[i r^1 \cdot M^{-1}{}_1x, \ldots, i r^J \cdot {}_J(M^{-1}x)]\, dx$$

$$= \frac{1}{\det(M)} \int f({}_1x)\, \exp\{i [M^{-1}]^* r^1 \cdot {}_1x, \ldots, i [{}_J(M^{-1})]^* r^J \cdot {}_Jx\}\, dx,$$

where $*$ denotes the adjoint and the pre-subscript denotes the appropriate symmetric
product. This shows that approximate symmetries under the affine group of the
function $f$ can be represented as approximate symmetries of the affine group acting
on the nonlinear Fourier transform of $f$. This indicates that searching zero-crossings
should be a good way to find appropriate transforms to code an image.


Section  5.0.  Comparison  With  Other  Image  Compression  Schemes. 

This section contains some results from other image compression schemes. This
section is by no means complete; the omission or inclusion of any results is happenstance.
The following graph shows results from a collection of recent publications,
listed below.

Figure 4. Signal to noise ratio vs. Compression from
a variety of recent publications and from the TTIE scheme.

The figure is somewhat misleading in that not all of the images compressed are
the same, or even the same size. The points are segregated into image size, results
from the literature and our results. The high fidelity, low compression results come
from a paper which used particularly simple images to encode. Nevertheless, the
TTIE scheme compares well with other results.

Table 3 tabulates the results in the figure and gives the references to the gathered
data. The references for the gathered data follow:

[1] Analysis/Synthesis for Subband Image Coding, Smith, Eddins, IEEE Trans.
Speech Acoustics and Signal Processing, Vol. 38, No. 8, Aug 1990.

[2] Adaptive Cosine Transform Coding of Images, Ngan, Leong, Singh, IEEE
Trans. Speech Acoustics and Signal Processing, Vol. 37, No. 11, Nov 1989.

[3] Image Coding - From Waveforms to Animation, R. Forchheimer, T. Kronander,
IEEE Trans. Speech Acoustics and Signal Processing, Vol. 37, No. 12, Nov
1989.

[4] Subband Image Coding, Woods and O'Neil, IEEE Trans. Speech Acoustics and
Signal Processing, Vol. 34, No. 10, Oct 1986.

[5] Non Linear Space-Variant Postprocessing of Block Coded Images, Ramamurthi,
Gersho, IEEE Trans. Speech Acoustics and Signal Processing, Vol. 34, No. 5,
May 1986.

[6] Sliding Block Entropy Encoding of Images, Cohen and Woods, M7.1 IEEE,
1989.

[7] Pruned Tree-Structured Vector Quantization in Image Coding, Riskin, Daly
and Gray, M7.2 IEEE, 1989.


Reference   Compression   SNR (db)   Image Size   Image
---------   -----------   --------   ----------   ------
1           10            30.7       256          simple
1           10            31.3       256          simple
1           5             35.1       256          simple
1           5             35.6       256          simple
1           15            28.7       256          simple
1           15            29.5       256          simple
2           20            33.5       512          Lena
2           10            37.0       512          Lena
3           9             24.3       256          Lena
4           16            29.6       ?            ?
4           8             33.0       ?            ?
4           4             36.5       ?            ?
5           11.4          29.9       512          Lena
6           8             34.8       512          Lena
6           17.7          32.5       512          Lena
6           14.3          31.1       512          Lena
7           25.6          29.2       512          Lena
7           25.6          30.92      512          Lena
7           15.68         32.43      512          Lena
TTIE        7.5           30.0       256          city
TTIE        19.54         30.28      256          collie
TTIE        11.8          31.96      256          collie
TTIE        [illegible]   28.48      256          collie
TTIE        18.3          29.4       256          collie
TTIE        13.1          32.4       256          girl
TTIE        12            28.9       256          Lena
TTIE        38.7          29.2       512          Lena
TTIE        15.91         32.1       512          Lena
TTIE        23.5          30.0       512          Lena

Table 3


Section  6.0.  Conclusion. 

The tiled transform image encoding scheme yields good results which are comparable
to the latest results attainable by other schemes. It is sufficiently mature to
be implemented in hardware, and this should be one of the next goals of pursuing
“fractal” image encoding further. The research into the subject, however, should
not be considered complete. The alternative schemes presented in the paper should
be followed to see if they yield even better results. Given that the TTIE scheme was
only explored for a short time, there is good reason to expect that further research
will lead to such results.

Finally, the mathematical modeling of the underlying processes should be extended.
As we demonstrated, the current level of understanding of the scheme is
rather incomplete. Basing a scheme on “motivational” arguments cannot be expected
to provide optimal results. A thorough research effort into building a good
model of the process should lead to much better results.

Section  7.0.  Recommendations. 

We recommend further research and development along several lines. First, the
tiling scheme is sufficiently mature to pursue hardware implementation for possible
real time applications. Second, we expect that the application of fractal techniques
with the other image processing methods presented, such as contour detection, will
lead to even better algorithms. This avenue of research should be developed. Finally,
research into the mathematical foundations of the subject has barely begun.
We believe that a program which integrates hardware engineering, software development
and mathematical research will yield the best results in the long run.

More  specific  recommendations  will  be  found  in  our  followup  Phase  II  proposal. 

Appendix A: Other Work

In this appendix we describe other work carried on during the Phase I research
which is of secondary interest. The section is organized into small subsections which
briefly describe the work and results.

Lossless  Compression. 

While  the  compression  ratios  attainable  with  lossless  compression  schemes  are 
far  lower  than  those  obtained  with  lossy  schemes,  the  utility  of  lossless  schemes  is 
higher  for  many  applications.  We  investigated  briefly  the  idea  of  compressing  the 
difference  between  an  original  image  and  a  highly  compressed  version  of  the  image. 
The  results  were  encouraging,  though  not  spectacular.  Compression  of  our  standard 
test  images  ranged  from  1.3  to  1.9.  Further  work  along  these  lines  is  warranted,  in 
view  of  the  importance  of  lossless  compression  schemes. 
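
The residual idea can be sketched as follows. This is an illustration under our own assumptions: 8-bit pixels stored as `bytes`, the residual taken modulo 256, and Python's standard `zlib` standing in for whichever lossless coder was actually used, which the report does not name. The function names are hypothetical.

```python
import zlib

def encode_residual(original, lossy):
    """Compress the per-pixel difference (mod 256) between original and lossy images."""
    residual = bytes((o - l) % 256 for o, l in zip(original, lossy))
    return zlib.compress(residual, 9)

def decode_residual(lossy, payload):
    """Recover the original exactly from the lossy image and the compressed residual."""
    residual = zlib.decompress(payload)
    return bytes((l + r) % 256 for l, r in zip(lossy, residual))
```

The total lossless size is the size of the lossy code plus the size of the compressed residual, so the approach pays off only when the lossy reconstruction is close enough to leave a low-entropy residual.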

Fourier  Methods. 

We  investigated  the  use  of  FFT  methods  in  conjunction  with  the  TTIE  scheme. 
The  results  were  uniformly  poor.  The  frequency  domain  does  not  display  the  type 
of  coherency  or  adjacent  pixel  correlation  which  the  TTIE  scheme  can  capitalize  on 
in  order  to  achieve  good  compressions. 

Alternative  Tile  Classification  Schemes. 

Using alternative classification schemes for tiles is somewhat technical. Reference
[1] has further details. The domain tiles used to encode an image are classified
in order to speed the search needed to find a “good” domain tile. Using a classification
scheme generally results in poorer fidelity, because there is no guarantee that
the optimal domain will be found in the class searched. We investigated several
methods based on correlation methods and moments. The results are not definitive,
being better in some features and worse in others as compared with the present
scheme.

Elimination  of  Compression  Artifacts. 

One weakness of the current compression scheme is the appearance of artifacts.
In order to eliminate these we attempted to postprocess the image by smoothing
along the boundaries of the range tiles. This was successful, often resulting in an
even lower rms error. Of the attempts to remove compression artifacts, this was the
most successful. This type of postprocessing can, and should, be utilized in any final
implementation of the TTIE scheme. While the TTIE scheme is still in development,
we felt it was better to concentrate on optimizing the results obtainable with the
method, rather than attempt to do a good job cleaning up afterwards.

Image  Preprocessing. 

In an effort to improve the ability of the encoding scheme in encoding images,
we investigated two image preprocessing methods. The first, based on Wigner
transforms, was disappointing. This transform led to data which was complicated.
The second method, using a wavelet transform, is potentially useful. We used this
transform as a method of edge detection, though we had hoped to find a deeper
relationship between it and the affine transformations used to encode images (see
section 4.3). This research did not lead to any concrete results.

Alternative  Tiling  Methods. 

We attempted to improve the fidelity of a given tiling by using linear combinations
of domain tiles. Initial results were very encouraging, having greatly reduced
artifacts. Using more than one domain tile significantly reduced the encoding error,
but decreased the overall compression. Without careful classification, the search
time needed to find even a moderate approximation to an optimal fit is prohibitive.
Our last approach to using several domain tiles in an encoding was to encode a range
optimally and then encode the resulting error. Due to time constraints, we did not
carefully study the relationship between the encoding fidelity and the compression
ratio as compared with the TTIE scheme. Such a study is warranted.


Appendix  B:  Images 

Figure B.1

The decoding process starting with an initial image containing
a pattern. The images show one, two and ten applications of all
of the encoding transformations. The lower right image is
reconstructed from a 10.0:1 compression of figure B.2 at
a signal to noise ratio of 29.5db.


Figure B.2

The  original  Lena  at  size  256x256. 


Figure  B.3 

The  original  collie  at  size  256x256  8bpp  (upper  left).  Collie  compressed 
63.0:1  with  a  signal  to  noise  ratio  of  25.2db  (upper  right).  Collie 
compressed  by  a  factor  of  28.2  with  a  signal  to  noise  ratio  of  29.3db 
(lower  left).  Collie  compressed  at  16.6:1  at  30.1db  snr  (lower  right). 

Figure B.4

A  reconstruction  of  a  512x512  original  version  of  Lena  compressed 
at  38.7:1  with  a  signal  to  noise  ratio  of  29.2db. 


Figure  B.5 

A  wavelet  transform  of  a  256  x  256  version  of  Lena 
resulting  in  the  extraction  of  edge  information. 


Figure  B.6 

An original map of Asia (left) and a fractal encoding of
that map. The resulting compression depends on the storage method
of the original map. For very efficient schemes, the compression is
still near 1 at this preliminary stage of development of the algorithm.


Figure  B.7 

Contour  encoding  of  images.  Lena  encoded  with  9  (upper)  and  15  (lower) 
contours  with  respective  compressions  of  5.9  and  4.2  and  signal 
to  noise  ratios  of  26.3db  and  26.5db.  We  expect  to  increase  the 
compression  once  the  contours  are  encoded  as  transformations. 

The  contours  are  shown  on  the  right  of  each  image. 


Figure B.8


Edge contour encoding of images. Lena encoded with two different
edge sensitivities (20 upper and 10 lower). The respective
compressions are 5.7 and 3.0 with signal to noise ratios of
22.5db and 27.4db. The signal to noise ratio does not reflect
the true image fidelity, since the edges are very well preserved by
this scheme. We expect to increase the compression once the edge
contours are encoded as transformations. The edge contours are shown
on the right of each image.


Figure B.9


A modified edge contour encoding of images. Lena encoded with two
different edge sensitivities (20 upper and 10 lower). The respective
compressions are 7.2 and 3.5 with signal to noise ratios of
21.6db and 24.4db. The signal to noise ratio does not reflect
the true image fidelity, since the edges are very well preserved by
this scheme. We expect to increase the compression once the edge
contours are encoded as transformations. The edge contours are shown
on the right of each image.


